AITopics | monte-carlo tree search

Monte-Carlo Tree Search for Constrained POMDPs

Neural Information Processing SystemsMar-16-2026, 20:55:22 GMT

Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.65)

Add feedback

Monte-Carlo Tree Search for Constrained POMDPs

Jongmin Lee, Geon-hyeong Kim, Pascal Poupart, Kee-Eung Kim

Neural Information Processing SystemsFeb-12-2026, 20:45:58 GMT

Neural Information Processing Systems http://nips.cc/

cc-pomcp, cost constraint, cpomdp, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

PlanninginMarkovDecisionProcesseswith Gap-DependentSampleComplexity

Neural Information Processing SystemsFeb-7-2026, 11:33:10 GMT

This problem-dependent sample complexityresult is expressed in terms of the sub-optimality gapsof the state-action pairs that are visited during exploration.

artificial intelligence, nth, planning & scheduling, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)

Add feedback

Monte-Carlo Tree Search by Best Arm Identification

Neural Information Processing SystemsNov-21-2025, 12:12:19 GMT

Sequential identification questions in game trees with stochastic payoffs arise naturally as robust versions of bandit problems.

algorithm, artificial intelligence, planning & scheduling, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Monte-Carlo Tree Search for Constrained POMDPs

Neural Information Processing SystemsNov-20-2025, 22:13:14 GMT

Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.

constrained pomdp, monte-carlo tree search, name change, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.65)

Add feedback

Monte-Carlo Tree Search for Constrained POMDPs

Jongmin Lee, Geon-hyeong Kim, Pascal Poupart, Kee-Eung Kim

Neural Information Processing SystemsNov-20-2025, 16:32:04 GMT

However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural.

artificial intelligence, machine learning, planning & scheduling, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

30de24287a6d8f07b37c716ad51623a7-Paper.pdf

Neural Information Processing SystemsOct-2-2025, 14:32:05 GMT

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.15)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.70)

Add feedback

Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search

Neural Information Processing SystemsSep-30-2025, 12:11:01 GMT

Monte-Carlo tree search is drawing great interest in the domain of planning under uncertainty, particularly when little or no domain knowledge is available. One of the central problems is the trade-off between exploration and exploitation. In this paper we present a novel Bayesian mixture modelling and inference based Thompson sampling approach to addressing this dilemma. The proposed Dirichlet-NormalGamma MCTS (DNG-MCTS) algorithm represents the uncertainty of the accumulated reward for actions in the MCTS search tree as a mixture of Normal distributions and inferences on it in Bayesian settings by choosing conjugate priors in the form of combinations of Dirichlet and NormalGamma distributions. Thompson sampling is used to select the best action at each decision node. Experimental results show that our proposed algorithm has achieved the state-of-the-art comparing with popular UCT algorithm in the context of online planning for general Markov decision processes.

bayesian mixture modelling and inference, name change, thompson sampling, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.65)

Add feedback

Towards Explaining Monte-Carlo Tree Search by Using Its Enhancements

Kowalski, Jakub, Winands, Mark H. M., Wiśniewski, Maksymilian, Reda, Stanisław, Wilbik, Anna

arXiv.org Artificial IntelligenceJun-17-2025

--Typically, research on Explainable Artificial Intelligence (XAI) focuses on black-box models within the context of a general policy in a known, specific domain. This paper advocates for the need for knowledge-agnostic explainability applied to the subfield of XAI called Explainable Search, which focuses on explaining the choices made by intelligent search techniques. It proposes Monte-Carlo Tree Search (MCTS) enhancements as a solution to obtaining additional data and providing higher-quality explanations while remaining knowledge-free, and analyzes the most popular enhancements in terms of the specific types of explainability they introduce. So far, no other research has considered the explainability of MCTS enhancements. We present a proof-of-concept that demonstrates the advantages of utilizing enhancements.

artificial intelligence, enhancement, explanation, (14 more...)

arXiv.org Artificial Intelligence

2506.13223

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Generalized Proof-Number Monte-Carlo Tree Search

Kowalski, Jakub, Soemers, Dennis J. N. J., Kosakowski, Szymon, Winands, Mark H. M.

arXiv.org Artificial IntelligenceJun-17-2025

This paper presents Generalized Proof-Number Monte-Carlo Tree Search: a generalization of recently proposed combinations of Proof-Number Search (PNS) with Monte-Carlo Tree Search (MCTS), which use (dis)proof numbers to bias UCB1-based Selection strategies towards parts of the search that are expected to be easily (dis)proven. We propose three core modifications of prior combinations of PNS with MCTS. First, we track proof numbers per player. This reduces code complexity in the sense that we no longer need disproof numbers, and generalizes the technique to be applicable to games with more than two players. Second, we propose and extensively evaluate different methods of using proof numbers to bias the selection strategy, achieving strong performance with strategies that are simpler to implement and compute. Third, we merge our technique with Score Bounded MCTS, enabling the algorithm to prove and leverage upper and lower bounds on scores - as opposed to only proving wins or not-wins. Experiments demonstrate substantial performance increases, reaching the range of 80% for 8 out of the 11 tested board games.

artificial intelligence, node, proof number, (16 more...)

arXiv.org Artificial Intelligence

2506.13249

Country: Europe (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback